Correlation-Based Refinement of Rules with Numerical Attributes
Abstract
Learning rules is a common way of extracting useful information from knowledge bases or databases. Many such data sets contain numerical attributes. However, approaches like Inductive Logic Programming (ILP) or association rule mining are optimized for data with categorical values, and considering numerical attributes is expensive. In this paper, we present an extension to the top-down ILP algorithm that enables efficient discovery of datalog rules from data with both numerical and categorical attributes. Our approach comprises a preprocessing phase that computes the correlations between numerical and categorical attributes, and an extension to the ILP refinement step that lets us detect interesting candidate rules and suggest refinements with relevant attribute combinations. We report on experiments with U.S. Census data, Freebase and DBpedia, and show that our approach helps to efficiently discover rules with numerical intervals.
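The abstract does not say which correlation measure the preprocessing phase uses. As a rough illustration only, the sketch below computes the correlation ratio (eta squared), a standard statistic for how much of a numerical attribute's variance is explained by a categorical attribute. The function name and the toy attribute values (`occupation`, `income`) are assumptions made for this example, not taken from the paper.

```python
from collections import defaultdict


def correlation_ratio(categories, values):
    """Return eta squared: the share of variance in `values`
    explained by grouping on `categories` (0.0 = none, 1.0 = all).

    Illustrative stand-in; the paper's actual measure is not given
    in the abstract.
    """
    if len(categories) != len(values) or not values:
        raise ValueError("inputs must be non-empty and of equal length")

    # Group the numerical values by their categorical label.
    groups = defaultdict(list)
    for category, value in zip(categories, values):
        groups[category].append(value)

    overall_mean = sum(values) / len(values)
    total_ss = sum((v - overall_mean) ** 2 for v in values)
    if total_ss == 0:
        # A constant numerical attribute carries no variance to explain.
        return 0.0

    # Between-group sum of squares, weighted by group size.
    between_ss = sum(
        len(g) * (sum(g) / len(g) - overall_mean) ** 2
        for g in groups.values()
    )
    return between_ss / total_ss


if __name__ == "__main__":
    # Hypothetical toy data: occupation (categorical) vs. income (numerical).
    occupation = ["clerk", "clerk", "engineer", "engineer", "engineer"]
    income = [30.0, 32.0, 60.0, 65.0, 58.0]
    print(correlation_ratio(occupation, income))
```

A score near 1 would flag the attribute pair as a candidate worth refining with a numerical interval, while a score near 0 would suggest skipping it; how the paper actually ranks or prunes combinations is not stated here.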
Similar resources
Action Rules Discovery Based on Tree Classifiers and Meta-actions
Action rules describe possible transitions of objects from one state to another with respect to a distinguished attribute. Early research on action rule discovery usually required the extraction of classification rules before constructing any action rule. Newest algorithms discover action rules directly from a decision system. To our knowledge, all these algorithms assume that all attributes ar...
Derived fuzzy importance of attributes based on the weakest triangular norm-based fuzzy arithmetic and applications to the hotel services
The correlation between the performance of attributes and the overall satisfaction such as they are perceived by the customers is often used to calculate the importance of attributes in the crisp case. Recently, the method was extended, based on the standard Zadeh extension principle, to the fuzzy case, taking into account the specificity of the human thinking. The difficulties of calculation are im...
Faults and fractures detection in 2D seismic data based on principal component analysis
Various approaches have been introduced to extract as much information as possible from seismic images for any specific reservoir or geological study. Faults and fractures are among the most sought-after objects for interpretation in geological studies of seismic images, and several strategies have been presented for this specific purpose. In this study, we have presented a modified approach of ap...
Retaining Customers Using Clustering and Association Rules in Insurance Industry: A Case Study
This study clusters customers and finds the characteristics of different groups in a life insurance company in order to find a way for prediction of customer behavior based on payment. The approach is to use clustering and association rules based on CRISP-DM methodology in data mining. The researcher could classify customers of each policy in three different clusters, using association rules. A...
Application of CAS wavelet to construct quadrature rules for numerical integration
In this paper, based on CAS wavelets we present quadrature rules for numerical solution of double and triple integrals with variable limits of integration. To construct the new method, we first approximate the unknown function by CAS wavelets. Then, by using suitable collocation points, we obtain the CAS wavelet coefficients, which are applied in approximating the unk...
Publication date: 2014